
    Single View Modeling and View Synthesis

    This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture the 3D world and display it in 2D, using mature technologies. However, there is a strong desire to record and re-explore the 3D world in 3D. Current approaches to this goal usually rely on a camera array, which suffers from tedious setup and calibration processes and a lack of portability, limiting its application to lab experiments. In this thesis, I produce 3D content using a single camera, making the process as simple as shooting pictures. This requires a new front-end capture device rather than a regular camcorder, as well as more sophisticated algorithms. First, to capture highly detailed object surfaces, I designed and built a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences at 30 fps, which is necessary for capturing dynamic scenes. Based on these color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. Although the camera can capture only part of an object at any instant, the partial surfaces are assembled into a complete 3D model by a novel warping algorithm. Inspired by the success of single-view 3D modeling, I extended the exploration to 2D-to-3D video conversion without a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to shift as much of the depth-inference work as possible from the user to the computer. I developed two new methods that analyze optical flow to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist the user's labeling work.
In this thesis, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithms can build high-fidelity 3D models of dynamic and deformable objects if depth maps are provided; otherwise, they can turn video clips into stereoscopic videos.
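The light fall-off stereo principle can be illustrated with a toy model: under a point light source, reflected intensity falls off as the inverse square of distance, so two exposures with the light displaced by a known baseline determine depth per pixel, with the surface albedo canceling out. This is only a sketch of the principle under ideal, noise-free assumptions; the function names and the setup are illustrative, not the thesis' implementation.

```python
import numpy as np

def lfs_depth(i_near, i_far, baseline):
    """Depth from light fall-off stereo (toy model).

    With a point light, intensity I = albedo / r**2.  Two shots with the
    light moved back by `baseline` give
        i_near / i_far = ((r + baseline) / r)**2,
    so r = baseline / (sqrt(i_near / i_far) - 1).  Albedo cancels out.
    """
    ratio = np.sqrt(i_near / i_far)
    return baseline / (ratio - 1.0)

# Synthetic check: a surface 2.0 units away, light displaced by 0.5 units.
r_true, baseline, albedo = 2.0, 0.5, 0.8
i_near = albedo / r_true**2
i_far = albedo / (r_true + baseline)**2
r_est = lfs_depth(i_near, i_far, baseline)   # recovers approximately 2.0
```

In practice the ratio image is noisy, which is why the thesis builds dedicated hardware and runs at 30 fps to average over dynamic scenes.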

    Phase Retrieval with Random Phase Illumination

    This paper presents a detailed numerical study of the performance of standard phasing algorithms with random phase illumination (RPI). Phasing with high-resolution RPI and the oversampling ratio σ = 4 determines a unique phasing solution up to a global phase factor. Under this condition, the standard phasing algorithms converge rapidly to the true solution without stagnation. An excellent approximation is achieved after a small number of iterations, not only with high-resolution but also with low-resolution RPI, in the presence of additive as well as multiplicative noise. It is shown that RPI with σ = 2 is sufficient for phasing complex-valued images under a sector condition, and σ = 1 for phasing nonnegative images. The Error Reduction algorithm with RPI is proved to converge to the true solution under proper conditions.
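The Error Reduction iteration with a random phase mask can be sketched numerically: alternate between imposing the measured oversampled Fourier magnitudes and the object-domain constraint (real, nonnegative), with the random illumination applied as a pointwise unimodular mask. This 1-D toy setup and all names are my assumptions; the paper studies 2-D images and precise oversampling ratios.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 32
x_true = rng.random(n)                         # unknown nonnegative object
mask = np.exp(2j * np.pi * rng.random(n))      # random phase illumination
m = 2 * n                                      # oversampled transform length

def forward(x):
    """Zero-padded Fourier transform of the illuminated object."""
    return np.fft.fft(mask * x, m)

b = np.abs(forward(x_true))                    # measured magnitudes

x = rng.random(n)                              # random initial guess
for _ in range(400):
    y = forward(x)
    y = b * np.exp(1j * np.angle(y))           # impose measured magnitudes
    z = np.fft.ifft(y)[:n] * np.conj(mask)     # back to object domain, undo mask
    x = np.maximum(z.real, 0.0)                # object constraint: real, nonnegative

rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Without the random mask the same iteration is prone to stagnation; the mask breaks the symmetries that cause it, which is the paper's central observation.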

    Task Driven Generative Modeling for Unsupervised Domain Adaptation: Application to X-ray Image Segmentation

    Automatic parsing of anatomical objects in X-ray images is critical to many clinical applications, in particular image-guided intervention and workflow automation. Existing deep network models require a large amount of labeled data. However, obtaining accurate pixel-wise labels in X-ray images relies heavily on skilled clinicians, due to the large overlaps of anatomy and the complex texture patterns. On the other hand, organs in 3D CT scans preserve clearer structures and sharper boundaries and thus can be delineated easily. In this paper, we propose a novel model framework for learning automatic X-ray image parsing from labeled CT scans. Specifically, a Dense Image-to-Image network (DI2I) for multi-organ segmentation is first trained on X-ray-like Digitally Reconstructed Radiographs (DRRs) rendered from 3D CT volumes. We then introduce a Task Driven Generative Adversarial Network (TD-GAN) architecture to achieve simultaneous style transfer and parsing for unseen real X-ray images. TD-GAN consists of a modified cycle-GAN substructure for pixel-to-pixel translation between DRRs and X-ray images, plus an added module that leverages the pre-trained DI2I to enforce segmentation consistency. The TD-GAN framework is general and can easily be adapted to other learning tasks. In numerical experiments, we validate the proposed model on 815 DRRs and 153 topograms. While the vanilla DI2I without adaptation fails completely at segmenting the topograms, the proposed model requires no topogram labels and achieves a promising average Dice score of 85%, comparable to the accuracy of supervised training (88%).
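The Dice score quoted in the evaluation (85% vs. 88%) is the standard overlap metric between a predicted and a ground-truth segmentation mask. A minimal helper (illustrative, not the paper's code):

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
gt = np.array([[1, 0, 0],
               [0, 1, 1]])
score = dice(pred, gt)   # 2*2 / (3 + 3) ≈ 0.667
```

For multi-organ segmentation the score is typically computed per organ class and averaged, which is how the 85% figure should be read.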

    Stick-Breaking Policy Learning in Dec-POMDPs

    Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from optimal. This paper considers a variable-size FSC to represent the local policy of each agent. These variable-size FSCs are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
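The stick-breaking prior behind Dec-SBPR builds a variable-size weight vector by repeatedly breaking off Beta-distributed fractions of a unit stick, so the effective number of FSC nodes can grow with the data. A generic (truncated) stick-breaking draw, with illustrative parameters that are not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking(alpha, k):
    """Draw k weights from a truncated stick-breaking process:
    v_i ~ Beta(1, alpha), w_i = v_i * prod_{j<i} (1 - v_j)."""
    v = rng.beta(1.0, alpha, size=k)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

w = stick_breaking(alpha=2.0, k=50)
# Weights are nonnegative and sum to just under 1; most of the mass sits
# on the first few sticks, so unused controller nodes get negligible weight.
```

Larger `alpha` spreads mass over more sticks, which in the Dec-SBPR setting corresponds to a prior preference for larger controllers.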

    Comparative analysis of binocular summation of pattern visual evoked potential before and after the surgery of concomitant strabismus

    AIM: To investigate the timing of concomitant strabismus surgery and its role in the treatment of strabismic amblyopia by analyzing changes in the binocular summation of the pattern visual evoked potential (P-VEP) before and after surgery for concomitant strabismus.

    METHODS: In this retrospective study we investigated 67 cases admitted to our hospital. All patients were under 18 years of age, and the postoperative squint angle was less than ±10△ (prism diopters). Patients were divided into three groups according to strabismus type, age, and degree of amblyopia. The P-VEP binocular summation response was recorded in all cases before surgery and at 1 and 3 months after surgery. The binocular/monocular (B/M) ratio of the P-VEP response was taken as the evaluation index.

    RESULTS: The B/M values of all three groups improved markedly 1 month after surgery, and the differences were statistically significant (P<0.01). 1) Three months after surgery, the B/M value in the esotropia group was higher than in the exotropia group (P<0.05). 2) Three months after surgery, the B/M value in the ≤6-year group was higher than in the >12-year group (P<0.05). 3) One month after surgery, the B/M value in the severe amblyopia group was higher than in the mild group (P<0.05); three months after surgery it was significantly higher (P<0.01).

    CONCLUSION: Concomitant strabismus surgery should be performed before 6 years of age in patients whose vision is difficult to improve with amblyopia treatment, especially those with severe amblyopia and esotropia (accommodative esotropia must be excluded). Early surgery is better for amblyopia treatment and the recovery of binocular vision.